# INTRODUCTION AND MODEL CONSTRUCTION
This analysis uses electricity consumption data from 1 January 2016 to 20 May 2021 to build and compare alternative forecasting approaches. To make the series as stationary as possible, we examine the data at several aggregation levels (daily, weekly, monthly, etc.). As a final step, we forecast with our model and report how far the forecast deviates from the real data.
We take the data from https://seffaflik.epias.com.tr/transparency/tuketim/gerceklesen-tuketim/gercek-zamanlituketim.xhtml and rename the columns to convenient names.
We add a datetime column (combining the Date and Hour columns) and check the autocorrelation of Consumption.
acf(el_cons$Consumption)
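The datetime step can be sketched as follows. This is a minimal illustration on synthetic data, not the original code: the `dd.mm.yyyy` date format, the `HH:MM` hour format and the sinusoidal consumption values are assumptions made only so the example runs on its own.

```r
# Synthetic stand-in for the downloaded data (7 days of hourly values).
# The date and hour formats below are assumptions, not taken from the source file.
days <- format(seq(as.Date("2016-01-01"), by = "day", length.out = 7), "%d.%m.%Y")
el_cons <- data.frame(
  Date = rep(days, each = 24),
  Hour = rep(sprintf("%02d:00", 0:23), times = 7),
  Consumption = 30 + 5 * sin(2 * pi * (0:167) / 24)
)

# Combine Date and Hour into a single datetime column.
el_cons$datetime <- as.POSIXct(
  paste(el_cons$Date, el_cons$Hour),
  format = "%d.%m.%Y %H:%M", tz = "UTC"
)

# Autocorrelation of the hourly series; plot = FALSE returns the values only.
hourly_acf <- acf(el_cons$Consumption, lag.max = 24, plot = FALSE)
hourly_acf$acf[25]  # autocorrelation at lag 24 (one day)
```

With a strong daily cycle, the autocorrelation at lag 24 stays high, which is the kind of pattern the hourly ACF is meant to reveal.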
As a first approach, we show the autocorrelation of the hourly data and plot the time series.
The time series plot is hard to read. The random component gives more insight into the data, so we decompose the series and plot the decomposition.
These plots are still not very informative. I prefer to look at the autocorrelation of the detrended data and then continue with a daily model.
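A decomposition and detrending step of this kind might look like the sketch below. It runs on a synthetic hourly series. Whether the original detrended series removes only the trend or also the seasonal component is not stated, so both variants are shown, and the period of 24 is an assumption.

```r
set.seed(42)
n <- 24 * 14  # two weeks of synthetic hourly data
x <- ts(30 + 0.01 * seq_len(n) + 5 * sin(2 * pi * seq_len(n) / 24) +
          rnorm(n, sd = 0.5),
        frequency = 24)

# Additive decomposition into trend, seasonal and random components.
dec <- decompose(x)

detrended <- x - dec$trend   # trend removed, seasonality kept
random_part <- dec$random    # trend and seasonality removed

# The moving-average trend is NA at both ends, so let acf() skip the NAs.
acf_detrended <- acf(detrended, na.action = na.pass, plot = FALSE)
```

Calling `plot(dec)` shows the four panels (observed, trend, seasonal, random) in one figure.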
Using the dplyr and zoo libraries, we aggregate the hourly data to daily data. After that step, we look at the autocorrelation and the time series plot, which should be more readable than before.
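The hourly-to-daily aggregation can be illustrated as below. To stay dependency-free this sketch uses base R `aggregate()` rather than dplyr and zoo, and it takes the daily mean. Both choices are assumptions, since the original aggregation function is not shown.

```r
# Three days of synthetic hourly data with a known daily level.
hours <- seq(as.POSIXct("2016-01-01 00:00", tz = "UTC"),
             by = "hour", length.out = 72)
el_cons <- data.frame(datetime = hours,
                      Consumption = rep(c(10, 20, 30), each = 24))

# Derive the calendar day and average the 24 hourly values of each day.
el_cons$Date <- as.Date(el_cons$datetime)
daily <- aggregate(Consumption ~ Date, data = el_cons, FUN = mean)
daily
```

Replacing `FUN = mean` with `FUN = sum` would give daily totals instead of daily averages.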
As with the hourly data, we now decompose the daily series and plot the result.
We now get a clearer view of the data. Next, we examine the autocorrelation of the detrended data.
The autocorrelation looks better than before, but some peaks remain.
This improvement encourages me to study weekly data, applying the same steps. After aggregating the data to weekly values, we plot the time series and its decomposition.
The random component looks the best so far. We take the detrended data and display its autocorrelation.
According to the autocorrelation, successive observations may be correlated, so an autoregressive part is expected in the model we build.
The next aggregation level is monthly. We aggregate the data and display the monthly time series below.
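The weekly and monthly aggregations follow the same pattern as the daily one. The sketch below is one possible way to write them in base R on synthetic daily data. Grouping weeks by their Monday and using the mean are assumptions, not details taken from the original code.

```r
# 56 synthetic daily values starting on a Monday (2016-01-04).
dates <- seq(as.Date("2016-01-04"), by = "day", length.out = 56)
daily <- data.frame(Date = dates, Consumption = as.numeric(1:56))

# Label each day with the Monday that starts its week (%u: 1 = Monday).
week_start <- daily$Date - (as.integer(format(daily$Date, "%u")) - 1L)
daily$week <- format(week_start)
daily$month <- format(daily$Date, "%Y-%m")

weekly <- aggregate(Consumption ~ week, data = daily, FUN = mean)
monthly <- aggregate(Consumption ~ month, data = daily, FUN = mean)
```

The resulting `weekly$Consumption` or `monthly$Consumption` vector can then be wrapped in `ts()` and passed to `decompose()` and `acf()` as before.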
The monthly series does not look useful to me. Still, it is worth looking at the decomposition plot and the autocorrelation of the detrended data.
Although the autocorrelation looks better, the random component of the weekly consumption is in better shape. I will be looking for a pattern every 7 days or at multiples of that.
Let us start by building AR models. The values of p are 1, 2, 3 (to check for a short-lag pattern), 7, 14, 21 and 28, respectively.
##
## Call:
## arima(x = detrend_daily, order = c(1, 0, 0))
##
## Coefficients:
## ar1 intercept
## 0.5150 0.0053
## s.e. 0.0195 0.0924
##
## sigma^2 estimated as 3.89: log likelihood = -4064.32, aic = 8134.65
## [1] 8134.65
## [1] 8151.357
##
## Call:
## arima(x = detrend_daily, order = c(2, 0, 0))
##
## Coefficients:
## ar1 ar2 intercept
## 0.6207 -0.2055 0.0054
## s.e. 0.0222 0.0223 0.0750
##
## sigma^2 estimated as 3.726: log likelihood = -4022.61, aic = 8053.22
## [1] 8053.222
## [1] 8075.498
##
## Call:
## arima(x = detrend_daily, order = c(3, 0, 0))
##
## Coefficients:
## ar1 ar2 ar3 intercept
## 0.6056 -0.1598 -0.0735 0.0054
## s.e. 0.0227 0.0263 0.0227 0.0697
##
## sigma^2 estimated as 3.706: log likelihood = -4017.37, aic = 8044.75
## [1] 8044.749
## [1] 8072.594
##
## Call:
## arima(x = detrend_daily, order = c(7, 0, 0))
##
## Coefficients:
## ar1 ar2 ar3 ar4 ar5 ar6 ar7 intercept
## 0.5139 -0.0941 0.0158 -0.0551 -0.0953 -0.0241 0.4000 0.0096
## s.e. 0.0208 0.0239 0.0239 0.0239 0.0239 0.0239 0.0208 0.1143
##
## sigma^2 estimated as 2.925: log likelihood = -3789.11, aic = 7596.23
## [1] 7596.226
## [1] 7646.346
##
## Call:
## arima(x = detrend_daily, order = c(14, 0, 0))
##
## Coefficients:
## ar1 ar2 ar3 ar4 ar5 ar6 ar7 ar8
## 0.8950 -0.2517 0.0465 -0.1563 -0.0062 -0.1511 0.8148 -0.8482
## s.e. 0.0227 0.0305 0.0309 0.0309 0.0310 0.0309 0.0242 0.0242
## ar9 ar10 ar11 ar12 ar13 ar14 intercept
## 0.1645 -0.1117 0.0706 -0.081 0.0641 -0.0133 0.0044
## s.e. 0.0308 0.0310 0.0309 0.031 0.0306 0.0228 0.0441
##
## sigma^2 estimated as 1.196: log likelihood = -2926.61, aic = 5885.22
## [1] 5885.224
##
## Call:
## arima(x = detrend_daily, order = c(21, 0, 0))
##
## Coefficients:
## ar1 ar2 ar3 ar4 ar5 ar6 ar7 ar8
## 0.9319 -0.2826 0.0450 -0.1473 -0.0237 -0.0999 0.4898 -0.4879
## s.e. 0.0227 0.0310 0.0317 0.0317 0.0318 0.0317 0.0299 0.0308
## ar9 ar10 ar11 ar12 ar13 ar14 ar15 ar16
## 0.0372 -0.0766 -0.0086 -0.0222 -0.0679 0.3610 -0.4744 0.1407
## s.e. 0.0327 0.0327 0.0328 0.0327 0.0328 0.0309 0.0300 0.0318
## ar17 ar18 ar19 ar20 ar21 intercept
## -0.0639 0.0460 -0.0632 0.0554 -0.0264 0.0035
## s.e. 0.0319 0.0318 0.0318 0.0312 0.0228 0.0311
##
## sigma^2 estimated as 1.015: log likelihood = -2768.58, aic = 5583.15
## [1] 5583.153
##
## Call:
## arima(x = detrend_daily, order = c(28, 0, 0))
##
## Coefficients:
## ar1 ar2 ar3 ar4 ar5 ar6 ar7 ar8
## 0.9339 -0.2761 0.0417 -0.1450 -0.0430 -0.0591 0.3475 -0.3656
## s.e. 0.0227 0.0311 0.0317 0.0317 0.0319 0.0319 0.0310 0.0313
## ar9 ar10 ar11 ar12 ar13 ar14 ar15 ar16
## 0.0031 -0.0691 -0.0198 -0.0442 -0.0559 0.2132 -0.3095 0.0949
## s.e. 0.0324 0.0323 0.0324 0.0324 0.0324 0.0317 0.0317 0.0324
## ar17 ar18 ar19 ar20 ar21 ar22 ar23 ar24
## -0.0548 -0.0156 -0.0444 -0.0378 0.2826 -0.3416 0.0441 -0.0452
## s.e. 0.0325 0.0325 0.0325 0.0325 0.0314 0.0310 0.0320 0.0320
## ar25 ar26 ar27 ar28 intercept
## 0.0427 -0.0021 0.0122 -0.0254 0.0028
## s.e. 0.0318 0.0318 0.0312 0.0228 0.0233
##
## sigma^2 estimated as 0.9154: log likelihood = -2669.85, aic = 5399.7
## [1] 5399.704
Based on these AIC and BIC values, the best autoregressive model is the one with p = 28. Its AIC is 5399.704, which is lower than that of the model with p = 21 (5583.153) and much lower than that of the model with p = 14 (5885.224).
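The comparison above can be sketched as a loop over candidate orders. This example uses a short simulated series and small values of p so that it runs quickly. The series, the grid and the variable names are illustrative assumptions, not the original `detrend_daily` data.

```r
set.seed(123)
# Simulated AR(2) series standing in for the detrended daily data.
x <- arima.sim(model = list(ar = c(0.6, -0.2)), n = 300)

p_grid <- c(1, 2, 3)
fits <- lapply(p_grid, function(p) arima(x, order = c(p, 0, 0)))

# Collect AIC and BIC for each candidate order in one table.
comparison <- data.frame(
  p = p_grid,
  AIC = sapply(fits, AIC),
  BIC = sapply(fits, BIC)
)
best_p <- comparison$p[which.min(comparison$AIC)]
comparison
```

BIC penalises each extra coefficient more heavily than AIC, so the two criteria can disagree when p is large.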
We repeat the procedure for MA models. The values of q are 1, 2 and 3, followed by 7, 14, 21 and 28.
##
## Call:
## arima(x = detrend_daily, order = c(0, 0, 1))
##
## Coefficients:
## ma1 intercept
## 0.5670 0.0054
## s.e. 0.0222 0.0698
##
## sigma^2 estimated as 3.843: log likelihood = -4052.47, aic = 8110.93
## [1] 8110.934
## [1] 8127.641
##
## Call:
## arima(x = detrend_daily, order = c(0, 0, 2))
##
## Coefficients:
## ma1 ma2 intercept
## 0.6234 0.2265 0.0054
## s.e. 0.0244 0.0281 0.0813
##
## sigma^2 estimated as 3.742: log likelihood = -4026.67, aic = 8061.34
## [1] 8061.339
## [1] 8083.615
##
## Call:
## arima(x = detrend_daily, order = c(0, 0, 3))
##
## Coefficients:
## ma1 ma2 ma3 intercept
## 0.6907 0.2781 -0.3487 0.0054
## s.e. 0.0208 0.0251 0.0199 0.0681
##
## sigma^2 estimated as 3.425: log likelihood = -3941.5, aic = 7893
## [1] 7892.998
## [1] 7920.842
##
## Call:
## arima(x = detrend_daily, order = c(0, 0, 7))
##
## Coefficients:
## ma1 ma2 ma3 ma4 ma5 ma6 ma7 intercept
## 0.7445 0.3417 0.0235 -0.2794 -0.5522 -0.7742 -0.1079 0.0018
## s.e. 0.0225 0.0257 0.0273 0.0319 0.0302 0.0321 0.0287 0.0138
##
## sigma^2 estimated as 2.3: log likelihood = -3557.26, aic = 7132.52
## [1] 7132.519
## [1] 7182.639
##
## Call:
## arima(x = detrend_daily, order = c(0, 0, 14))
##
## Coefficients:
## ma1 ma2 ma3 ma4 ma5 ma6 ma7 ma8 ma9
## 0.7300 0.342 0.0930 -0.1456 -0.3514 -0.4778 0.2117 0.0381 -0.1638
## s.e. 0.0237 0.028 0.0275 0.0302 0.0284 0.0295 0.0311 0.0318 0.0328
## ma10 ma11 ma12 ma13 ma14 intercept
## -0.2665 -0.3315 -0.3626 -0.3861 0.0708 2e-04
## s.e. 0.0323 0.0293 0.0245 0.0423 0.0389 9e-04
##
## sigma^2 estimated as 1.698: log likelihood = -3266.02, aic = 6564.03
## [1] 6564.032
## [1] 6653.134
##
## Call:
## arima(x = detrend_daily, order = c(0, 0, 21))
##
## Coefficients:
## ma1 ma2 ma3 ma4 ma5 ma6 ma7 ma8
## 0.7692 0.3947 0.1281 -0.1436 -0.3772 -0.5065 0.1233 0.0058
## s.e. 0.0238 0.0296 0.0297 0.0319 0.0330 0.0391 0.0395 0.0357
## ma9 ma10 ma11 ma12 ma13 ma14 ma15 ma16
## -0.1171 -0.2335 -0.3200 -0.4610 -0.4446 0.1584 0.0504 0.0795
## s.e. 0.0345 0.0320 0.0297 0.0289 0.0378 0.0373 0.0335 0.0378
## ma17 ma18 ma19 ma20 ma21 intercept
## 0.0656 0.0260 -0.1374 -0.1980 0.1379 1e-04
## s.e. 0.0407 0.0373 0.0286 0.0362 0.0315 9e-04
##
## sigma^2 estimated as 1.474: log likelihood = -3129.7, aic = 6305.39
## [1] 6305.39
## [1] 6433.475
##
## Call:
## arima(x = detrend_daily, order = c(0, 0, 28))
##
## Coefficients:
## ma1 ma2 ma3 ma4 ma5 ma6 ma7 ma8
## 0.8031 0.3946 0.1093 -0.1766 -0.4005 -0.5339 0.1064 0.0270
## s.e. 0.0257 0.0311 0.0381 0.0341 0.0398 0.0134 0.0237 0.0389
## ma9 ma10 ma11 ma12 ma13 ma14 ma15 ma16
## -0.1385 -0.2491 -0.3590 -0.4846 -0.4813 0.1576 0.1024 0.0924
## s.e. 0.0291 0.0364 0.0298 0.0397 NaN NaN 0.0459 0.0220
## ma17 ma18 ma19 ma20 ma21 ma22 ma23 ma24 ma25
## 0.0984 0.0303 -0.1199 -0.2178 0.1801 0.0491 0.0021 0.0137 0.0104
## s.e. 0.0285 0.0160 0.0455 NaN NaN 0.0530 0.0264 0.0341 0.0209
## ma26 ma27 ma28 intercept
## 0.0045 -0.0370 0.0193 1e-04
## s.e. 0.0330 0.0269 0.0181 8e-04
##
## sigma^2 estimated as 1.423: log likelihood = -3095.93, aic = 6251.86
## [1] 6251.864
## [1] 6418.931
Among the MA models, q = 28 gives the lowest AIC, 6251.86. This is still higher than the AIC of the best autoregressive model (5399.704), so the AR model is actually the better of the two.
We now build a combined autoregressive moving average (ARMA) model based on the p and q values found above, 28 and 28: (p,q)=(28,28)
##
## Call:
## arima(x = detrend_daily, order = c(28, 0, 28))
##
## Coefficients:
## ar1 ar2 ar3 ar4 ar5 ar6 ar7 ar8
## 0.6522 -0.1385 0.1864 -0.0178 0.0932 0.0907 -0.5712 0.6065
## s.e. NaN 0.1376 NaN NaN 0.0735 NaN NaN 0.0185
## ar9 ar10 ar11 ar12 ar13 ar14 ar15 ar16
## -0.3533 0.0579 0.0079 -0.0722 -0.1659 0.6066 -0.5399 0.2713
## s.e. 0.0627 0.1095 0.1072 NaN 0.1093 0.0402 NaN 0.1220
## ar17 ar18 ar19 ar20 ar21 ar22 ar23 ar24 ar25
## -0.0241 -0.0095 0.0043 0.0481 0.8194 -0.7366 0.2039 -0.237 0.0025
## s.e. 0.1094 0.0575 0.0856 0.1027 NaN NaN 0.0946 NaN 0.1371
## ar26 ar27 ar28 ma1 ma2 ma3 ma4 ma5
## -0.0415 0.0111 0.1273 0.2829 0.1253 -0.0966 -0.2398 -0.3652
## s.e. NaN 0.0904 NaN NaN NaN 0.0254 NaN 0.0235
## ma6 ma7 ma8 ma9 ma10 ma11 ma12 ma13
## -0.4636 0.2879 -0.2842 -0.0463 -0.0258 -0.1097 -0.0143 0.1553
## s.e. NaN 0.0565 NaN 0.0619 NaN NaN 0.0477 0.0377
## ma14 ma15 ma16 ma17 ma18 ma19 ma20 ma21
## -0.4219 0.0621 -0.0633 -0.1003 -0.0183 -0.0321 -0.0326 -0.8960
## s.e. NaN 0.0325 0.0308 0.0530 NaN 0.0748 NaN 0.0256
## ma22 ma23 ma24 ma25 ma26 ma27 ma28 intercept
## -0.0608 0.0041 0.2155 0.335 0.3669 0.3532 0.0834 0e+00
## s.e. NaN 0.0260 NaN NaN 0.0598 NaN 0.0390 2e-04
##
## sigma^2 estimated as 0.7105: log likelihood = -2441.39, aic = 4998.79
Increase q by 7 and set p to 2: (p,q)=(2,35)
##
## Call:
## arima(x = detrend_daily, order = c(2, 0, 35))
##
## Coefficients:
## ar1 ar2 ma1 ma2 ma3 ma4 ma5 ma6
## 1.2402 -0.9898 -0.4837 0.4497 0.4780 0.1795 -0.0433 -0.2185
## s.e. 0.0050 0.0065 0.0272 0.0303 0.0302 0.0261 0.0236 0.0271
## ma7 ma8 ma9 ma10 ma11 ma12 ma13 ma14
## 0.4165 -0.6018 -0.1595 -0.0526 -0.1441 -0.2597 -0.4068 0.3703
## s.e. 0.0290 0.0350 0.0422 0.0395 0.0374 0.0314 0.0362 0.0359
## ma15 ma16 ma17 ma18 ma19 ma20 ma21 ma22
## -0.4962 -0.0879 0.0753 -0.0239 -0.0567 -0.2744 0.4944 -0.2951
## s.e. 0.0486 0.0531 0.0508 0.0519 0.0422 0.0441 0.0331 0.0430
## ma23 ma24 ma25 ma26 ma27 ma28 ma29 ma30
## -0.0759 0.1214 -0.0058 0.0099 -0.2722 0.3793 -0.1809 -0.0446
## s.e. 0.0477 0.0469 0.0518 0.0466 0.0454 0.0304 0.0345 0.0345
## ma31 ma32 ma33 ma34 ma35 intercept
## 0.0784 0.0712 0.0813 -0.1672 0.1877 0.0004
## s.e. 0.0383 0.0384 0.0364 0.0358 0.0314 0.0011
##
## sigma^2 estimated as 1.134: log likelihood = -2876.5, aic = 5831
(p,q)=(2,21)
##
## Call:
## arima(x = detrend_daily, order = c(2, 0, 21))
##
## Coefficients:
## ar1 ar2 ma1 ma2 ma3 ma4 ma5 ma6
## 0.5944 -0.074 0.1756 -0.0148 -0.0624 -0.1751 -0.2484 -0.3020
## s.e. 0.0861 0.078 0.0831 0.0804 0.0583 0.0381 0.0309 0.0302
## ma7 ma8 ma9 ma10 ma11 ma12 ma13 ma14
## 0.3909 -0.1035 -0.1748 -0.2047 -0.1964 -0.2667 -0.1949 0.3873
## s.e. 0.0464 0.0450 0.0264 0.0290 0.0351 0.0343 0.0380 0.0398
## ma15 ma16 ma17 ma18 ma19 ma20 ma21 intercept
## -0.0865 -0.0173 -0.0342 -0.0133 -0.1042 -0.0813 0.3265 1e-04
## s.e. 0.0405 0.0275 0.0277 0.0232 0.0256 0.0273 0.0235 7e-04
##
## sigma^2 estimated as 1.454: log likelihood = -3117.15, aic = 6284.3
(p,q)=(2,42)
##
## Call:
## arima(x = detrend_daily, order = c(2, 0, 42))
##
## Coefficients:
## ar1 ar2 ma1 ma2 ma3 ma4 ma5 ma6 ma7
## 0.4594 0.0552 0.3779 0.0085 -0.0663 -0.1940 -0.2903 -0.343 0.2315
## s.e. 0.1989 0.1322 0.1978 0.1984 0.1341 0.0714 0.0468 0.061 0.1028
## ma8 ma9 ma10 ma11 ma12 ma13 ma14 ma15
## -0.0328 -0.1840 -0.2050 -0.2259 -0.2635 -0.2588 0.3087 0.0123
## s.e. 0.0676 0.0327 0.0537 0.0728 0.0818 0.0935 0.1080 0.0689
## ma16 ma17 ma18 ma19 ma20 ma21 ma22 ma23
## -0.0411 0.0043 0.0168 -0.0814 -0.1299 0.4255 0.0469 -0.0402
## s.e. 0.0359 0.0369 0.0337 0.0380 0.0462 0.0428 0.0717 0.0637
## ma24 ma25 ma26 ma27 ma28 ma29 ma30 ma31
## -0.0147 -0.0413 -0.0706 -0.0940 0.4005 0.0484 -0.0303 -0.0694
## s.e. 0.0449 0.0327 0.0395 0.0463 0.0391 0.0638 0.0572 0.0447
## ma32 ma33 ma34 ma35 ma36 ma37 ma38 ma39
## -0.0794 -0.1049 -0.1095 0.2658 0.0004 -0.0225 -0.0328 -0.0836
## s.e. 0.0346 0.0444 0.0451 0.0422 0.0497 0.0458 0.0408 0.0291
## ma40 ma41 ma42 intercept
## -0.1611 -0.0895 0.2119 1e-04
## s.e. 0.0359 0.0447 0.0484 8e-04
##
## sigma^2 estimated as 1.146: log likelihood = -2888.57, aic = 5869.13
(p,q)=(2,14)
##
## Call:
## arima(x = detrend_daily, order = c(2, 0, 14))
##
## Coefficients:
## ar1 ar2 ma1 ma2 ma3 ma4 ma5 ma6
## 0.5722 -0.1239 0.1612 0.0215 -0.0300 -0.1617 -0.2444 -0.3347
## s.e. 0.1859 0.0836 0.1845 0.1314 0.0696 0.0374 0.0380 0.0486
## ma7 ma8 ma9 ma10 ma11 ma12 ma13 ma14
## 0.4192 -0.1193 -0.2015 -0.1988 -0.1895 -0.1714 -0.2336 0.2831
## s.e. 0.0807 0.0485 0.0262 0.0470 0.0603 0.0558 0.0610 0.0677
## intercept
## 1e-04
## s.e. 8e-04
##
## sigma^2 estimated as 1.689: log likelihood = -3261.07, aic = 6558.14
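Searching for the best model amounts to moving the parameters up and down to get a better AIC value. This leads us to the model with p = 28 and q = 28, whose AIC of 4998.79 is the lowest among the models tried. Note that several of its standard errors are reported as NaN, so the individual coefficient estimates should be read with caution.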
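A search of this kind can be automated. The sketch below is an illustration on simulated data with a deliberately small grid. It records the AIC of each (p,q) pair and also flags fits whose standard errors are not finite, which is how NaN entries like those in the (28,28) output can be detected. The grid and all names are assumptions.

```r
set.seed(7)
# Simulated ARMA(1,1) series standing in for the detrended daily data.
x <- arima.sim(model = list(ar = 0.5, ma = 0.4), n = 300)

grid <- expand.grid(p = 0:2, q = 0:2)
grid$AIC <- NA_real_
grid$se_ok <- NA

for (i in seq_len(nrow(grid))) {
  # Some orders may fail to fit; keep going and leave their AIC as NA.
  fit <- tryCatch(arima(x, order = c(grid$p[i], 0, grid$q[i])),
                  error = function(e) NULL)
  if (!is.null(fit)) {
    grid$AIC[i] <- AIC(fit)
    # Standard errors are the square roots of the diagonal of var.coef.
    se <- suppressWarnings(sqrt(diag(fit$var.coef)))
    grid$se_ok[i] <- all(is.finite(se))
  }
}

best <- grid[which.min(grid$AIC), ]
best
```

Restricting the choice to rows where `se_ok` is `TRUE` is one way to avoid selecting a model whose coefficient uncertainty could not be estimated.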
# FORECASTING
All the work done so far is now used to forecast data that we already have, in order to test how good our model is. By definition, Y(t) - Ŷ(t) = residual(t). We have Y(t) and the residuals, so the fitted values follow directly: Ŷ(t) = Y(t) - residual(t).
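The relation between observations, residuals and fitted values can be checked with a small example. The simulated series and the AR(1) order are assumptions chosen only to keep the sketch short.

```r
set.seed(1)
# Simulated series standing in for the modelled component.
y <- arima.sim(model = list(ar = 0.5), n = 200)
fit <- arima(y, order = c(1, 0, 0))

# Fitted values: observed series minus model residuals.
y_hat <- y - residuals(fit)

# Sanity check: fitted values plus residuals reproduce the data.
all.equal(as.numeric(y_hat + residuals(fit)), as.numeric(y))
```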
After modelling the random component and the electricity consumption, we plot the random component with the fitted points of our model on it, together with a plot of the consumption itself.
We forecast the 14 days from 6 May 2021 to 20 May 2021, for which we already have real data, to see how good our model is. We predict with our ARIMA model and convert the result to a time series. The trend value is the last trend value of the daily decomposition, and the seasonal component is obtained in the same way.
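The recomposition described above might be sketched as follows. It is a simplified illustration on synthetic data: the weekly period, the AR(1) order and the alignment of the forecast horizon are assumptions. In particular, the forecast here starts right after the last non-missing random value, which lies a few steps before the end of the series because the moving-average trend is undefined at the edges.

```r
set.seed(2021)
f <- 7
n <- 30 * f
season <- rep(c(2, 1, 0, -1, -2, 0.5, -0.5), length.out = n)
noise <- as.numeric(arima.sim(model = list(ar = 0.5), n = n))
y <- ts(30 + 0.01 * seq_len(n) + season + noise, frequency = f)

dec <- decompose(y)
rand <- na.omit(dec$random)  # drop the NA edges of the random component

# Model the random component and predict h steps ahead.
fit <- arima(rand, order = c(1, 0, 0))
h <- 14
pred <- predict(fit, n.ahead = h)$pred

# Add back the last available trend value and the matching seasonal figures.
# Indexing dec$figure by cycle position is valid because y starts at position 1.
last_trend <- as.numeric(tail(na.omit(dec$trend), 1))
seasonal_part <- dec$figure[as.integer(cycle(pred))]
forecast_y <- as.numeric(pred) + last_trend + seasonal_part
```

Holding the trend constant at its last value is a simple choice that works for short horizons but ignores any drift over the forecast window.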
Although the model does not capture the sudden drop, our forecast looks reasonable. As a summary statistic, we use the WMAPE (weighted mean absolute percentage error) of our forecast.
## Time Series:
## Start = c(66, 3)
## End = c(66, 17)
## Frequency = 30
## [1] 12.105934 11.901525 11.635454 10.622005 11.292832 11.352290 9.907331
## [8] 8.473307 8.799297 9.181065 9.533440 11.413106 12.030311 11.673463
## [15] 12.247834
According to the absolute percentage error for each date, the error is below 15 percent on every day, which suggests that our model is good enough.
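For reference, the two error measures mentioned above can be computed as in this small sketch. The numbers are made up for illustration and are not the actual consumption or forecast values.

```r
# Hypothetical actual and forecast values (not the real data).
actual <- c(100, 200, 400)
forecast <- c(110, 180, 400)

# Absolute percentage error for each observation.
ape <- 100 * abs(actual - forecast) / abs(actual)

# WMAPE: total absolute error relative to total actual volume.
wmape <- 100 * sum(abs(actual - forecast)) / sum(abs(actual))

ape
wmape
```

Unlike the plain mean of the percentage errors, WMAPE weights each day by its actual consumption, so errors on high-consumption days count more.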